FOREWORD

Among  the  responsibilities  assigned  to  the  Office  of  the  Manager,  National 
Communications  System,  is  the  management  of  the  Federal  Telecommunication 
Standards  Program.  Under  this  program,  the  NCS,  with  the  assistance  of  the 
Federal  Telecommunication  Standards  Committee,  identifies,  develops,  and
coordinates  proposed  Federal  Standards  which  contribute  either  to  the
interoperability  of  functionally  similar  Federal  telecommunication  systems  or
to  the  achievement  of  a  compatible  and  efficient  interface  between  computer  and
telecommunication  systems.  In  developing  and  coordinating  these  standards,  a 
considerable  amount  of  effort  is  expended  in  initiating  and  pursuing  joint 
standards  development  efforts  with  appropriate  technical  committees  of  the 
Electronic  Industries  Association,  the  American  National  Standards  Institute,
the  International  Organization  for  Standardization,  and  the  International 
Telegraph  and  Telephone  Consultative  Committee  of  the  International 
Telecommunication  Union.  This  Technical  Information  Bulletin  presents  an 
overview  of  an  effort  which  is  contributing  to  the  development  of  compatible 
Federal,  national,  and  international  standards  in  the  area  of  Video 
Teleconferencing.  It  has  been  prepared  to  inform  interested  Federal  activities 
of  the  progress  of  these  efforts.  Any  comments,  inputs  or  statements  of 
requirements  which  could  assist  in  the  advancement  of  this  work  are  welcome  and 
should  be  addressed  to:

1.0  INTRODUCTION  AND  SUMMARY

This  document  summarizes  work  performed  by  Delta  Information 
Systems,  Inc.,  for  the  Office  of  Technology  and  Standards  of  the 
National  Communications  System,  an  organization  of  the  U.  S. 
Government,  headed  by  National  Communications  System  Assistant 
Manager  Dennis  Bodson.  Mr.  Bodson  is  responsible  for  the 
management  of  the  Federal  Telecommunications  Standards  Program, 
which  develops  telecommunications  standards,  the  use  of  which  is 
mandatory  for  all  Federal  agencies.  The  purpose  of  this  study, 
performed  under  Task  number  1,  Modification  P00009  of  Contract 
number  DCA100-83-C-0047,  was  to  determine  the  feasibility  of 
measuring  image  quality  of  video  teleconferencing  systems  using 
objective  rather  than  subjective  procedures. 

The  techniques  for  testing  digital  television  systems  which 
incorporate  signal  processing  for  the  purpose  of  reducing  the 
number  of  bits  which  need  to  be  transmitted  to  define  a  video 
frame  are,  at  this  time,  poorly  defined.  In  general,  purely 
subjective  test  procedures  have  been  used  to  date.  It  would  be 
desirable  to  produce  a  set  of  quantitative  data  which  correlates
directly  with  a  set  of  qualitative  data.  The  qualitative  data 
will  serve  as  the  initial  criteria  for  the  evaluation  of  codec 
performance.  Analysis  of  the  correlation  between  the  quantitative 
and  qualitative  data  will  then  permit  the  development  of  a  set  of 
quantitative  tests  whose  results  will  serve  as  a  non-subjective
standard  for  the  future  evaluation  of  codecs. 
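
As a rough illustration of the correlation analysis described above, the following Python sketch computes a Pearson correlation coefficient between one objective measurement and the subjective scores of the same codecs. The function name and the sample numbers are hypothetical and are not taken from this study; the choice of Pearson correlation is itself an assumption, since the report does not name a specific statistic.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: one objective figure and one subjective score per codec.
objective = [120.0, 95.0, 88.0, 40.0]
subjective = [4.1, 3.6, 3.4, 1.9]
r = pearson_r(objective, subjective)
```

A coefficient near +1 or -1 would suggest that the objective parameter tracks the subjective ranking; a value near zero would suggest that it does not.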


This  study  is  an  important  first  step  towards  the  very 
ambitious  goal  of  establishing  objective  test  methods  for  digital 
video  codecs.  A  first  effort  of  this  type  cannot  be  expected  to 
immediately  accomplish  all  stated  objectives.  However,  it  can  be 
considered  successful  if  it  provides  good  understanding  of  the 
applicable  criteria  and  inherent  problems,  and  clearly  points  the 
way  towards  future  efforts  which  will  fully  accomplish  the  stated 
objectives.  This  report  will  show  that  this  goal  has  been 
achieved.

Section  2  briefly  reviews  the  previously  performed 
subjective  tests  and  provides  some  additional  analysis  to  convert 
the  results  to  a  format  which  is  more  useful  for  this  study. 
Objective  tests  are  described  and  analyzed  in  Section  3.  It 
contains  the  results  of  test  tape  and  direct  measurements  and 
includes  a  first  attempt  at  objectively  evaluating  motion 
performance.  In  Section  4  the  subjective  and  objective 
measurement  results  are  translated  into  a  common  format  and 
checked  for  correlation.  Section  5  briefly  summarizes  the  program 
and  makes  recommendations  for  future  efforts.

2.0  REVIEW  OF  SUBJECTIVE  TESTS

2.1  Analysis  of  Past  Results

During  1984  and  1985,  DIS  performed  extensive  subjective 
evaluations  of  four  video  codec  models  available  at  that  time  and 
operating  at  1.544  Mbps.  These  tests  are  documented  in  the  Final 
Report  entitled  "Test  and  Evaluation  of  Teleconferencing  Video 
Codecs  Transmitting  at  1.5  Mbps"  which  was  submitted  to  the 
National  Communications  System  on  August  23,  1985.  This  report 
summarizes  the  test  results  on  Table  4-7  which  is  repeated  here 
for  reference  as  Table  2-1.  The  score  comparison  chart  shows 
graphically  both  the  comparative  scores  of  each  codec  pair  and 
the  resulting  mean  values.  The  chart  is  not  to  scale  and  the 
various  values  cannot  add  up  because  they  represent  means  derived 
in  different  steps  from  subjective  test  scores.  However,  the 
results  are  consistent  and  thus  produce  a  high  confidence  in 
their  validity. 

The  most  obvious  impairment  of  codec  performance  is  the 
rendition  of  motion  which  influences  the  subjective  evaluation 
most  heavily.  While  most  analog  performance  parameters  can  be 
measured  on  a  codec  without  difficulty  there  is  as  yet  no 
available  methodology  for  objective  measurements  of  motion 
performance.  Much  correlation  therefore  cannot  be  expected
between  the  subjective  test  results  shown  on  Table  2-1  and
objective  analog  measurements.  Consequently,  the  content
of  the  DIS  codec  test  tape  was  reviewed  and  most  sequences  put
into  one  of  three  categories,  depending  on  their  motion  content.
These  categories  are: 

1.  Still  graphics  and  slow  motion 

2.  Lively  motion 

3.  Camera  zooming 

Comparative  scores  for  each  codec  pair,  mean  values  and  codec 
rankings  were  computed  separately  for  each  category,  following 
the  steps  previously  used  in  preparing  Table  2-1.  The  results 
are  shown  on  Tables  2-2,  2-3,  and  2-4.  The  relative  rankings  are 
essentially  just  as  consistent  as  for  the  full  test  tape  but  show 
some  significant  differences.  In  category  1  the  scores  of  CLI, 
NEC  and  GEC  are  very  close,  almost  within  the  expected  margin  of 
error.  Category  2  shows  values  somewhat  similar  to  the  full  test 
tape  while  for  Category  3  the  spread  becomes  much  larger.  This 
is  consistent  with  the  fact  that  camera  zooming  stresses  the 
motion  capability  of  a  codec  most  severely. 
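
The step from comparative pair scores to mean values and a ranking can be sketched in Python as follows. This is only an illustration under stated assumptions: the codec labels A, B, C and the scores are hypothetical, and the sign convention (a positive score means the first codec of the pair was preferred) is assumed rather than taken from the test procedure.

```python
from collections import defaultdict


def rank_from_pairs(pair_scores):
    """Derive per-codec mean scores and a ranking from pairwise scores.

    pair_scores maps (first, second) to the score of `first` relative to
    `second`; a positive value means `first` was preferred (assumed).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for (first, second), score in pair_scores.items():
        totals[first] += score
        totals[second] -= score
        counts[first] += 1
        counts[second] += 1
    means = {codec: totals[codec] / counts[codec] for codec in totals}
    ranking = sorted(means, key=means.get, reverse=True)
    return means, ranking


# Hypothetical comparative scores for three codecs in one motion category.
pairs = {("A", "B"): 0.5, ("A", "C"): 1.0, ("B", "C"): 0.4}
means, ranking = rank_from_pairs(pairs)
```

Running the same function once per motion category would give one ranking per category, which is the kind of comparison made on Tables 2-2 to 2-4.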

The  slight  inconsistencies  between  relative  and  mean  scores 
on  Tables  2-2  and  2-4  are  not  errors  but  inevitable  small 
ambiguities  caused  by  the  inherent  imperfection  of  subjective 
testing.  The  practical  interpretation  of  the  results  is  that  in 
these  cases  the  difference  between  GEC  and  NEC  is  within  the 
expected  margin  of  error. 

In  all  categories,  the  score  of  Fujitsu  remains  very  low.
It  became  obvious  that  the  main  problem  of  this  codec  is  its
motion  capability,  and  this  may  also  affect  its  score  even  in
Category  1.  All  sequences  are  switched  at  start  and  end,  and
many  contain  a  further  switch  between  two  different  pictures.
Any  switch  represents  a  kind  of  motion,  and  on  the  Fujitsu  codec
a  switch  is  reproduced  as  a  vertical  wipe.  This  may  have  been
annoying  enough  to  cause  a  low  score  by  most  evaluators  even  if
the  still  picture  itself  is  no  more  than  slightly  impaired.

Another  factor  affecting  subjective  evaluation  which 
presently  is  unlikely  to  be  identified  objectively  is  artifacts 
which  are  uniquely  caused  by  the  codec  algorithm.  Any  codec 
shows  occasional  spurious  output  phenomena,  such  as  contours, 
stripes,  squares  or  other  patterns  triggered  generally  by  certain 
features  of  the  input  picture.  The  appearance  of  such  artifacts 
can  be  very  annoying  yet  very  difficult  to  predict  or  identify  as 
to  their  causes. 

2.2  Additional  Recent  Tests 

Video  codec  technology  is  in  a  state  of  rapid  development. 
All  the  models  on  which  the  1984  tests  were  performed  are  at  best 
obsolescent  at  this  time.  The  new  models  are  generally 
interoperable  with  the  older  versions.  They  feature  mainly  a 
selection  of  several  lower  data  rates  in  addition  to  improvements 
in  the  coding  algorithms  and  operating  convenience.  The  CLI
VTS-1.5E  has  been  replaced  by  the  "Rembrandt",  and  the  NEC  NETEC-XI
by  the  NETEC-XV.  GEC  has  no  published  specific  new  model
designation.  Fujitsu  showed  a  simulation  of  an  improved  codec  in
1984,  the  design  of  which  has  evidently  been  completed,  but
this  new  model  has  not  yet  been  made  available  for  independent
tests.

During  May  1986  DIS  performed  a  large  number  of  subjective 
codec  tests  for  INTELSAT.  Objective  tests  were  considered  but 
could  not  be  implemented  because  of  time  limitations.  These 
tests  covered  a  wide  range  of  performance  and  data  rates.  In  the 
area  of  full  motion  codecs  three  models  were  available,  namely 
CLI  Rembrandt,  NEC  NETEC-XV,  and  Philips  VCD-2M.  The  latter  unit 
follows  the  European  COST-211  standard  and  is  interoperable  with 
the  GEC  codec.  The  equipments  under  test  operated  at  data  rates 
from  384  Kbps  to  2.048  Mbps. 

A  direct  comparison  between  the  1984  and  1986  tests  cannot 
be  made  because  the  INTELSAT  tests  had  a  different  objective. 
Their  purpose  was  to  evaluate  the  usability  of  codecs  at  various 
data  rates  for  specific  typical  applications  of  digital  TV.  No 
codec  was  to  be  rated  individually,  and  no  comparison  of  the 
performance  of  specific  codecs  was  to  be  made.  The  test  tape 
consisted  mainly  of  selected  scenes  from  the  tape  used  for  the 
1984  tests,  but  re-arranged  and  edited  to  meet  the  INTELSAT 
requirements.

The  test  results  show  that  all  three  codecs  are  basically 
acceptable  for  the  selected  typical  applications  over  their  full 
range  of  data  rates.  As  expected,  performance  improves  at  higher 
data  rates.  Without  making  an  actual  comparison,  all  experienced 
observers  agreed  that  the  new  codec  models  performed  better  than 
their  predecessors.  The  differences  between  models  tend  to
become  smaller  but  remain  definitely  noticeable.  Motion
performance  continues  to  be  the  most  important  factor  in 
assessing  codec  quality.  Thus,  even  without  a  quantitative 
comparison,  the  new  tests  increase  confidence  in  the  validity  of 
the  previously  obtained  data. 


3.0  OBJECTIVE  TESTS 

3.1  Purpose 

Subjective  tests  are  awkward  and  time  consuming  in  terms  of 
both  execution  and  evaluation.  A  large  amount  of  data  is 
necessary  to  achieve  confidence  in  the  results.  Yet,  the  quality 
and  usefulness  of  a  picture  should  ultimately  be  judged  by  the 
viewer.  Analog  broadcast  TV  went  through  many  years  of 
subjective  evaluations  before  it  was  possible  to  establish 
correlation  with  objective  parameters  which  can  be  readily 
measured.  There  now  exist  specifications  and  standards  for  most 
TV  applications  which  give  performance  limits  of  all  pertinent 
parameters  known  to  determine  picture  quality. 

So  far  no  meaningful  objective  tests  for  codecs  have  been 
developed.  Though  many  conventional  analog  tests  can  be  readily 
performed  on  a  codec,  it  has  not  been  determined  how  meaningful 
these  results  are  and  how  they  relate  to  a  subjective  evaluation. 
Correlation  of  these  two  types  of  tests  would  greatly  facilitate 
further  developments  in  the  digital  TV  area.  Some  examples  are 
as  follows: 

o  Optimization  of  the  parameters  of  a  specific  coding
   algorithm.

o  Comparison  of  the  performance  of  different  coding
   algorithms.

o  Performance  monitoring  of  TV  transmission  systems
   containing  one  or  several  codecs.  Limits  of  acceptable
   system  performance  can  subsequently  be  established.

3.2  Parameter  Selection 

There  are  three  main  documents  specifying  analog  TV 
parameters.  They  are  EIA  RS-170A,  EIA  RS-250B,  and  NTC  Report
No.  7.  RS-170A  gives  the  basic  specifications  of  the  signal
waveform.  RS-250B  and  NTC-7  are  similar  in  content  and  cover  the
performance  parameters  likely  to  be  affected  by  signal  processing 
and  transmission.  Both  documents  also  give  suggested  measurement 
methods  and  test  signals. 

The  codec  encoder  processes  the  analog  signal  in  a  radical 
fashion  so  that  the  format  of  the  transmitted  compressed  signal 
bears  no  resemblance  to  the  incoming  signal.  The  decoder  re- 
constitutes  the  analog  signal  which  means  that  signal  waveforms 
and  timing  are  generated  there  and  not  directly  influenced  by 
either  the  incoming  digital  signal  or  the  encoding/decoding  and 
transmission  processes.  Therefore,  compliance  with  the  waveform 
parameters  of  RS-170A  is  not  dependent  on  the  encoding  algorithm 
and  thus  not  a  high  priority  item  for  objective  testing.  On  the 
other  hand,  most  of  the  parameters  specified  in  RS-250B  and/or 
NTC-7  may  be  affected  by  the  encoding  algorithm  and  therefore 
should  be  considered  for  an  objective  test  program. 

There  are  other  factors  unique  to  codecs  which  affect  some 
of  the  objective  test  parameters.  Since  codecs  normally  "clip" 
the  transmitted  picture  by  reducing  both  width  and  height,  both
horizontal  and  vertical  blanking  will  be  intentionally  wider  than
specified  in  RS-170A.  Parameters  which  are  mainly  affected  by 
certain  factors  typical  of  analog  transmission  become  largely 
irrelevant  in  a  digital  transmission  system.  Non-linear  transfer 
characteristics  and  dynamic  gain  distortions  are  often  caused  by 
limitations  in  FM  detectors  and  low  frequency  response.
Therefore,  measurements  of  dynamic  gain,  long  time  waveform
distortion  (bounce),  and  use  of  average  picture  levels  (APL)
other  than  50%  in  differential  gain  and  phase  measurements  become 
unnecessary.  Transmission  noise  of  all  types  is  highly  unlikely 
to  affect  the  received  picture  because  the  decoder  is  tolerant  to 
error  rates  up  to  10“*>  before  forward  error  correction.  The  only 
noise  to  be  considered  is  a  sum  of  quantizing  noise  and 
contributions  from  power  supplies  and  other  portions  of  the 
circuit.  This  noise  level  is  inherently  low.  The  output  level 
of  the  re-constituted  signal  (often  called  insertion  gain),  once 
properly  set,  is  most  likely  to  stay  constant.  Field  time 
waveform  distortion  caused  by  low  frequency  response  limitations 
will  be  low  and  constant.  Therefore,  the  number  of  important 
parameters  for  objective  testing  can  be  considerably  reduced. 

3.3  Measurements 
3.3.1  Video  Tape  Tests 

The  test  tape  prepared  for  the  1984  comparative  codec  tests 
consists  of  two  parts.  The  first  part  contains  the  scenes  for 
strictly  subjective  evaluation  which  were  used  for  the  tests
described  in  paragraph  2.1.  The  second  part  contains  a  variety
of  test  signals  to  be  used  partly  for  fully  objective
measurements  and  partly  for  viewing  by  video  experts  with  the
anticipation  that  basically  subjective  but  possibly  semi-objective
results  may  be  obtained.  Both  parts  were  processed
through  the  four  codecs  under  test  and  the  outputs  recorded  on  1"
video  tape.

Table  3-1  gives  the  scenario  for  the  test  signal  portion  of 
the  test  tape.  The  selection  of  the  various  signals  was  based  on 
both  established  practice  and  reasonable  expectations  of  what  may 
be  accomplished.  Sequences  1  to  14  contain  conventional  signals 
largely  used  for  evaluation  and  objective  measurements  of  analog 
TV  signals.  Sequences  15  to  18  contain  artificial  controlled 
motion  and  were  designed  to  implement  initial  attempts  to  develop 
a  methodology  for  objective  measurement  or  semi-objective 
evaluation  of  motion  performance.  Specifically,  sequences  15  and 
16  contain  switching  between  two  radically  different  pictures  for 
the  purpose  of  simulating  fast  motion. 

The  very  straightforward  test  arrangement  is  shown  on  Figure 
3-1.  The  test  tape  to  be  analyzed  is  played  in  the  1"  tape 
recorder  which  is  equipped  with  the  appropriate  time  base 
corrector  and  allows  frame-by-frame  manual  advance.  The  signal 
waveform  is  shown  directly  on  a  waveform  monitor  and  a  vector 
display  is  presented  on  a  vectorscope  when  needed.  The  picture 
is  also  viewed  on  a  high  quality  monitor.  An  oscilloscope  camera,
Tektronix  C-4,  is  used  to  photograph  selected  waveforms,  vector
displays  and  monitor  pictures.

The  results  of  the  tests  did  not  come  up  to  expectations. 
When  measuring  the  test  tape  before  processing  through  a  codec  it 
was  found  that  some  of  the  signals  contained  a  considerable
amount  of  distortion.  It  cannot  be  determined  after  the  fact
whether  this  was  due  to  imperfections  in  the  test  signal 
generators  or  to  a  problem  in  the  taping  process.  It  will  be 
shown  subsequently  that  adjustments  of  the  measurements  were  made 
to  achieve  meaningful  results. 

One  important  result  of  the  measurements  was  that  not  all 
test  signals  are  suitable  for  use  with  digital  video  codecs. 
Though  the  encoding  algorithms  differ  between  codecs,  they  all 
have  some  or  all  of  the  features  of  bandwidth  limitation, 
horizontal  and  vertical  sampling  and  sub-sampling,  and 
interpolation.  These  factors  introduce  distortions  such  as 
aliasing  and  full  or  partial  suppression  of  some  test  signal 
portions.  Some  of  these  distortions  can  be  recognized  and 
discarded  but  the  tests  made  it  obvious  that  some  modifications 
of  test  signals  are  necessary  to  make  meaningful  objective 
measurements  on  digital  TV  codecs. 

Most  values  were  read  from  the  waveform  monitor  or 
vectorscope  screens  using  the  standard  graticules.  In  some  cases
waveforms  were  photographed  with  the  oscilloscope  camera.  The
signals  photographed  were  the  color  bar  chart,  differentiated
unmodulated  ramp,  modulated  ramp  through  a  3.58  MHz  band  pass
filter,  and  video  sweep  at  vertical  (field)  scanning  rate.  In
addition,  the  picture  of  the  video  sweep  was  photographed  on  the
monitor  screen.  These  photographs  are  shown  on  Figures  3-2  to  3-6.

The  photographs  of  the  color  bar  chart  vector  display  can  be 
used  to  determine  amplitude  and  phase  errors.  The  video  sweep 
display  shows  the  frequency  response  but  great  care  must  be  taken 
to  properly  interpret  the  pattern  because  the  complex  processing 
of  the  codec  produces  aliasing  and  other  spurious  patterns.  The 
photographs  of  the  monitor  screen  help  in  identifying  the  meaning 
of  the  various  portions  of  the  sweep  display.  They  show  that 
signal  amplitudes  appearing  on  the  waveform  monitor  at  and  above 
3  MHz  consist  of  lower  frequencies  produced  by  aliasing  and 
similar  phenomena  caused  by  sampling,  interpolation  and  filtering
and  thus  do  not  depict  a  real  response.  This  cannot  be  readily 
identified  on  the  waveform  monitor.  Generally  only  the  first 
part  of  the  pattern  with  an  envelope  decreasing  from  a  high  value 
at  a  low  frequency  to  zero  is  a  true  representation  of  the  codec 
response.  The  area  of  this  envelope  gives  a  measure  of  the 
response.
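
The folding of high sweep frequencies into lower apparent frequencies can be illustrated with a short Python sketch. It assumes ideal sampling at a single rate; the 6 MHz rate used in the example is hypothetical, chosen only because it places the folding point at 3 MHz, and is not a documented parameter of any codec tested here.

```python
def alias_frequency(f_mhz, fs_mhz):
    """Apparent frequency (MHz) of a tone at f_mhz after ideal sampling
    at fs_mhz, folded into the range 0 .. fs_mhz / 2."""
    if fs_mhz <= 0:
        raise ValueError("sampling rate must be positive")
    folded = f_mhz % fs_mhz
    return min(folded, fs_mhz - folded)


# Hypothetical sampling rate of 6 MHz (folding point at 3 MHz).
for f in (2.0, 3.58, 4.2):
    print(f, "MHz appears at about", round(alias_frequency(f, 6.0), 2), "MHz")
```

Under this assumption a sweep component below the folding point is reproduced at its own frequency, while components above it reappear lower in the band, which is why the later part of the sweep display cannot be read as a true response.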

The  unmodulated  and  modulated  ramp  signals  were  expected  to
yield  measures  of  quantizing  noise  and  sampling  accuracy.
However,  these  signals  were  so  heavily  contaminated  by  spurious
components  that  it  was  impossible  to  derive  any  meaningful
quantitative  data  from  these  displays.

Table  3-2  lists  the  results  that  were  obtained.  It  shows
the  test  parameters,  test  signals,  and  the  results  for  the
original  test  tape  and  for  the  tape  processed  through  the  four
codecs  under  test.  The  difference  between  the  original  and
processed  tapes  yields  the  corrected  values  which  will  be  used
in  further  analysis.

Figure  3-2  Color  Bar  Chart  Patterns  on  Vectorscope

Figure  3-3  Differentiated  Unmodulated  Ramp  on  Waveform  Monitor

Figure  3-4  Modulated  Ramp  through  Band  Pass  Filter  on  Waveform  Monitor

Figure  3-5  Field  Rate  Video  Sweep  on  Waveform  Monitor

[Table  3-2:  entries  illegible  in  source]

Several  explanatory  remarks  to  Table  3-2  are  necessary.  The 
frequency  response  patterns  shown  on  Figure  3-2  are  difficult  to 
analyze  because  of  aliasing  and  other  distortions  introduced  by 
the  codec  algorithm.  A  numerical  result  was  obtained  as  follows: 


$$
\text{FIGURE OF MERIT} = 1000 \, \frac{\int_{0}^{Z} Y(f)\,df}{\int_{0}^{4.2} X(f)\,df}
$$

where  X  =  Sweep  amplitude  at  codec  input
       Y  =  Sweep  amplitude  at  codec  output
       Z  =  Frequency  of  zero  response  at  codec  output  as
             seen  on  waveform  monitor

The  integration  is  performed  on  a  point-by-point  basis. 
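
A point-by-point integration of this kind can be sketched in Python as shown below. The sketch assumes that the figure of merit is 1000 times the ratio of the output sweep area from 0 to Z to the input sweep area from 0 to 4.2 MHz, and it uses the trapezoidal rule as one plausible point-by-point method. The function names and the sample readings are hypothetical and are not measured data.

```python
def trapezoid_area(freqs, amps):
    """Point-by-point (trapezoidal) area under amps sampled at freqs."""
    area = 0.0
    for i in range(1, len(freqs)):
        area += 0.5 * (amps[i - 1] + amps[i]) * (freqs[i] - freqs[i - 1])
    return area


def figure_of_merit(freqs_mhz, x_in, y_out, z_mhz, f_max_mhz=4.2):
    """1000 * (area of Y from 0 to Z) / (area of X from 0 to f_max).

    Assumes freqs_mhz is ascending, starts at 0, and contains z_mhz and
    f_max_mhz as sample points, so no interpolation is needed.
    """
    out_f = [f for f in freqs_mhz if f <= z_mhz]
    out_y = y_out[:len(out_f)]
    in_f = [f for f in freqs_mhz if f <= f_max_mhz]
    in_x = x_in[:len(in_f)]
    return 1000.0 * trapezoid_area(out_f, out_y) / trapezoid_area(in_f, in_x)


# Hypothetical graticule readings (relative amplitude versus MHz).
freqs = [0.0, 1.05, 2.1, 3.15, 4.2]
x_in = [1.0, 1.0, 1.0, 1.0, 1.0]
y_out = [1.0, 1.0, 0.5, 0.0, 0.0]
fom = figure_of_merit(freqs, x_in, y_out, z_mhz=3.15)
```

Dividing by the input area is what compensates for an imperfect sweep on the original tape: a codec that reproduced the input exactly up to 4.2 MHz would score 1000 regardless of the input amplitude.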

This  method  of  computation  takes  variations  in  the  input 
signal  into  account.  The  short  time  waveform  distortion 
measurements  are  affected  by  the  limited  frequency  response;  in 
addition,  an  unexplained  contamination  of  the  signal  with  a  low 
amplitude  color  subcarrier  made  measurement  of  overshoots 
impossible  in  several  cases.  Incomplete  measurements  of 
differential  gain  and  phase  are  due  to  the  fact  that  at  an  APL  of 
10%  and  90%  color  subcarrier  exists  only  on  every  fifth  line.
This  is  not  compatible  with  chrominance  vertical  subsampling  and
subsequent  interpolation  in  the  codec  resulting  in  a  very  low 
subcarrier  amplitude  and  therefore  a  high  noise  level  which  makes 
meaningful  measurements  impossible. 
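
To illustrate why a subcarrier present on only every fifth line is hard to measure after vertical processing, the following Python sketch averages a hypothetical line pattern over a window of consecutive lines. The simple moving average merely stands in for the codec's vertical subsampling and interpolation, whose actual filters are not described here, so the numbers are indicative only.

```python
def peak_after_line_averaging(period_lines, window_lines, total_lines=60):
    """Largest relative subcarrier amplitude left after averaging.

    One line in every `period_lines` carries full subcarrier (1.0) and the
    others carry none (0.0). A moving average over `window_lines` lines is
    an assumed stand-in for the codec's vertical chrominance filtering.
    """
    lines = [1.0 if i % period_lines == 0 else 0.0 for i in range(total_lines)]
    averaged = [
        sum(lines[i:i + window_lines]) / window_lines
        for i in range(total_lines - window_lines + 1)
    ]
    return max(averaged)


# Subcarrier on every fifth line, averaged over 2 or 4 lines (hypothetical).
two_line = peak_after_line_averaging(5, 2)
four_line = peak_after_line_averaging(5, 4)
# Subcarrier on every line, as in a full field 50% APL signal.
every_line = peak_after_line_averaging(1, 4)
```

In this simplified model the every-fifth-line signal loses half or more of its amplitude while the every-line signal is unaffected, which is consistent with the low subcarrier level observed at 10% and 90% APL.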


3.3.2  Direct  Measurements

As  part  of  another  program,  DIS  made  complete  objective 
measurements  on  two  CLI  Rembrandt  codecs.  Though  this  is  a  more 
recent  and  improved  model,  the  changes  in  the  signal  processing 
portion  are  small  and  the  encoding  algorithm  is  identical  with 
the  one  in  the  earlier  VTS-1.5E.  Therefore,  the  test  results  are 
an  applicable  input  to  check  and  verify  some  of  the  other  results 
of  this  program. 

A  block  diagram  of  the  test  setup  is  shown  on  Figure  3-7.
The  tests  covered  many  more  parameters  than  required  for  codec
testing  and  still  did  not  come  close  to  utilizing  the  total 
capability  of  the  equipment.  A  minor  limitation  was  due  to  the 
fact  that  all  test  signals  were  those  which  are  conventionally 
used  by  the  broadcast  industry,  meaning  that  multiburst  and 
chroma  pulse  were  not  optimized  for  the  requirements  of  the 
codec.  Each  test  encompassed  a  complete  encoder/decoder 
combination  from  analog  input  to  analog  output  and  included  all 
elements  of  internal  digital  processing. 

The  most  important  element  in  the  tests  was  the  TEKTRONIX 
1980  ANSWER  equipment  which  allows  the  collection  of  large 
amounts  of  highly  accurate  data  in  a  very  short  time.  It 
requires  an  external  display  terminal  for  entering  commands  and 
display  of  the  test  results.  After  the  results  have  been 
reviewed,  they  are  fed  to  the  printer  one  page  at  a  time.  All 
parameter  measurements  are  programmed  in  ANSWER  and  arranged  in
desirable  groupings  for  ease  in  commanding.  Should  other
groupings  be  more  convenient,  they  can  easily  be  programmed. 

ANSWER  makes  all  measurements  on  a  single  line,  utilizing 
every  element  of  the  signal  appearing  on  this  line.  In  the  case 
of  the  tests  described  herein,  full  field  test  signals  were  used. 
In  this  case,  any  line  during  the  picture  interval  can  be  used 
for  measurement  with  no  effect  on  the  results,  and  one  line  was 
selected  arbitrarily.  The  test  signals  were  selected  in  the  1910 
Signal  Generator  and  the  ANSWER  Test  Set  was  commanded  to  measure 
the  parameters  which  can  be  handled  by  each  test  signal.  The 
results  were  first  viewed  on  the  screen  of  the  display  terminal 
and  subsequently  printed.  Copies  of  the  printouts  are  shown  on 
Tables  3-3  to  3-6. 

A  review  of  the  test  results  shows  that,  with  the  exception 
of  field  time  waveform  distortion  which  cannot  be  measured  on  a 
single  line  and  was  not  programmed  into  the  available  unit  of 
ANSWER,  all  pertinent  TV  Signal  parameters  have  been  covered.  A 
few  words  of  clarification  are  needed,  mainly  because  the  test 
equipment  was  set  up  for  standard  broadcast  video  performance. 

The  amplitude-frequency  response  is  given  only  as  relative
amplitudes  of  the  six  frequency  packets  of  the  multiburst  signal
which  must  be  compared  to  their  original  amplitude  of  60%.  The
packet  frequencies  are  0.5,  1,  2,  3,  3.58  and  4.2  MHz  which  is  not
compatible  with  the  limitations  of  the  codec  and  gives  only  three
(3)  useful  points  of  measurement.  The  chrominance  pulse  has  its
conventional  width  of  12.5T  instead  of  the  20T  (T  =  125  nsec)


TEKTRONIX VIDEO MEASUREMENTS
12-NOV-85 15:26:22                               VIOLATED LIMITS
                                                 LOWER    UPPER

CHANNEL A (S/N 002119)  NTSC  15:26:29  APL = 48%
MEASURING FIELD 1, LINE 65

MULTIBURST FLAG       100.9   IRE
FCC MB PACKET #1       53.0   % FLAG
FCC MB PACKET #2       50.2   % FLAG
FCC MB PACKET #3       43.7   % FLAG     **      45.0     75.0
FCC MB PACKET #4        7.1   % FLAG     **      45.0     75.0
FCC MB PACKET #5        1.3   % FLAG     **      45.0     75.0
FCC MB PACKET #6         .6   % FLAG     *       45.0     75.0

CHANNEL B (S/N 002121)  NTSC  15:27:06  APL = 46%
MEASURING FIELD 1, LINE 65

MULTIBURST FLAG       100.6   IRE
FCC MB PACKET #1       56.9   % FLAG
FCC MB PACKET #2       54.8   % FLAG
FCC MB PACKET #3       50.4   % FLAG
FCC MB PACKET #4        8.2   % FLAG     **      45.0     75.0
FCC MB PACKET #5        1.0   % FLAG     **      45.0     75.0
FCC MB PACKET #6         .3   % FLAG     *       45.0     75.0
(COMMANDS DONE)
PAGE,MESURA,COMB,MESURB,COMB
PAGE,MESURA,COMB,MESURB,COMB RECEIVED

TEKTRONIX VIDEO MEASUREMENTS
12-NOV-85 15:28:27                               VIOLATED LIMITS
                                                 LOWER    UPPER

CHANNEL A (S/N 002119)  NTSC  15:28:34  APL = 56%
MEASURING FIELD 1, LINE 65                       100 IRE = 714 mV

NTC7 20 IRE CHROMA     20.2   IRE                (REF 40 IRE CHR)
NTC7 80 IRE CHROMA     75.6   IRE                (REF 40 IRE CHR)
NTC7 CHR NL PHASE       2.9   DEG
NTC7 CHR-LUM INTMD       .8   IRE                (REF LUM PED)

CHANNEL B (S/N 002121)  NTSC  15:29:06  APL = 54%
MEASURING FIELD 1, LINE 65                       100 IRE = 714 mV

NTC7 20 IRE CHROMA     20.8   IRE                (REF 40 IRE CHR)
NTC7 80 IRE CHROMA     70.3   IRE   **   71.5   88.5   (REF 40 IRE CHR)
NTC7 CHR NL PHASE       3.7   DEG
NTC7 CHR-LUM INTMD       .8   IRE                (REF LUM PED)
(COMMANDS DONE)

TABLE  3-3

MEASUREMENTS  WITH  COMBINATION  TEST  SIGNAL
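
The multiburst readings can be turned into a relative frequency response by comparing each packet with the nominal 60% packet amplitude. The Python sketch below does this for the Channel A readings of Table 3-3, using the packet frequencies of 0.5, 1, 2, 3, 3.58 and 4.2 MHz. Expressing the ratio in decibels is a presentation choice made for this illustration and is not part of the printout.

```python
import math


def response_db(measured_pct_flag, nominal_pct_flag=60.0):
    """Packet amplitude relative to its nominal value, in decibels."""
    if measured_pct_flag <= 0 or nominal_pct_flag <= 0:
        raise ValueError("amplitudes must be positive")
    return 20.0 * math.log10(measured_pct_flag / nominal_pct_flag)


# Packet frequencies (MHz) and Channel A readings (% of flag) from Table 3-3.
packet_mhz = [0.5, 1.0, 2.0, 3.0, 3.58, 4.2]
channel_a = [53.0, 50.2, 43.7, 7.1, 1.3, 0.6]

response = {f: round(response_db(v), 1) for f, v in zip(packet_mhz, channel_a)}
for f in packet_mhz:
    print(f"{f:>5} MHz  {response[f]:>6} dB")
```

The first three packets stay within a few decibels of nominal, while the packets at 3 MHz and above fall far below it, which matches the observation that only three packets give useful points of measurement.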


TEKTRONIX VIDEO MEASUREMENTS
12-NOV-85 17:19:05                               VIOLATED LIMITS
                                                 LOWER    UPPER

CHANNEL A (S/N 002119)  NTSC  17:19:12  APL = 61%
MEASURING FIELD 1, LINE 65                       100 IRE = 714 mV

BLANKING LEVEL                % CARR
BAR AMPLITUDE          96.3   IRE
SYNC AMPLITUDE         41.7   % BAR
BLANKING VARIATION       .6   % BAR
SYNC VARIATION           .7   % BAR
BURST AMPLITUDE        99.0   % SYNC
H BLANK 4 IRE         11.91   USEC       **     10.49    11.16
SYNC WIDTH             4.77   USEC
SYNC RISETIME          95.0   NSEC
SYNC FALLTIME          90.0   NSEC
SYNC-TO-SETUP         10.06   USEC
FRONT PORCH            1.31   USEC       *        1.4   999.99
SYNC-TO-BURST-END      7.64   USEC
BREEZEWAY               .35   USEC       **       .38   999.99
BURST WIDTH             9.0   CYCLES
EQUALIZER WIDTH        49.4   % S.W.
SERRATION WIDTH        4.79   USEC
V BLANK 4 IRE F1       22.0   LINES      **      18.3     21.1
V BLANK 4 IRE F2       23.0   LINES      **      18.3     21.1
LINE TIME DIST           .9   %
PULSE/BAR RATIO        82.0   %          **      94.0    106.0
SCH PHASE             -71.0   DEG        *      -45.0     45.0
CHROMA-LUM DELAY       41.0   NSEC       *      -40.0     40.0
CHROMA-LUM GAIN        97.8   %
DIFF GAIN (DG)          4.2   %
DIFF PHASE (DP)         1.7   DEG
LUM NL DIST (DY)        9.5   %
REL BURST GAIN         -7.6   %
REL BURST PHASE        -1.1   DEG
2T PULSE RINGING        2.8   % KF
2T BAR DIST (LD)        1.8   % KF       RINGING
2T BAR DIST (TR)         .8   % KF       RINGING
(COMMANDS DONE)

TABLE  3-4

MEASUREMENTS  WITH  COMPOSITE  TEST  SIGNAL
TEKTRONIX VIDEO MEASUREMENTS
12-NOV-85 17:23:09                               VIOLATED LIMITS
                                                 LOWER    UPPER

CHANNEL B (S/N 002121)  NTSC  17:23:21  APL = 58%
MEASURING FIELD 1, LINE 65

BLANKING LEVEL            -   % CARR
BAR AMPLITUDE                 IRE
SYNC AMPLITUDE                % BAR
BLANKING VARIATION            % BAR
SYNC VARIATION                % BAR
BURST AMPLITUDE               % SYNC
H BLANK 4 IRE          ?.49   USEC       **     10.49    11.16
SYNC WIDTH             ?.75   USEC
SYNC RISETIME          ?.0    NSEC
SYNC FALLTIME          ?.0    NSEC
SYNC-TO-SETUP          ?.47   USEC
FRONT PORCH            ?.74   USEC
SYNC-TO-BURST-END      ?.59   USEC
BREEZEWAY              ?.42   USEC
BURST WIDTH            ?.7    CYCLES
EQUALIZER WIDTH        ?.3    % S.W.
SERRATION WIDTH        ?.77   USEC
V BLANK 4 IRE F1       ?.0    LINES      **      18.3     21.1
V BLANK 4 IRE F2              LINES      **      18.3     21.1
LINE TIME DIST                %
PULSE/BAR RATIO               %          **      94.0    106.0
SCH PHASE                     DEG
CHROMA-LUM DELAY              NSEC
CHROMA-LUM GAIN               %
DIFF GAIN (DG)                %
DIFF PHASE (DP)               DEG
LUM NL DIST (DY)              %
REL BURST GAIN                %
REL BURST PHASE               DEG
2T PULSE RINGING              % KF
2T BAR DIST (LD)              % KF       RINGING
2T BAR DIST (TR)              % KF       RINGING
(COMMANDS DONE)

[Most values are only partly legible in the source; "?" marks lost
leading digits. Unplaced value fragments: 22, 84., 27, 1., 18., -9.,
-2., 2., 1., 3. Unplaced limit pair: 0.0, 10.0.]

TABLE  3-5

MEASUREMENTS  WITH  COMPOSITE  TEST  SIGNAL
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TEKTRONIX  VIDEO  MEASUREMENTS 
12-NOV-85 15:24:49                             VIOLATED LIMITS
                                               LOWER     UPPER

CHANNEL A (S/N 002119)   NTSC   15:24:55   APL = 50%
MEASURING FIELD 1, LINE 65
FCC COLOR BARS:

          AMPL ERROR     PHASE ERROR     CHR/LUM RATIO
              %              DEG             % NOM

YEL          13.3            -.3             114.3
CYN           6.0           -3.8             107.3
GRN          10.9           -1.7             112.9
MAG          10.0           -3.3             112.2
RED           3.7           -4.1             105.5
BLU           8.5             .8             112.0

CHANNEL B (S/N 002121)   NTSC   15:25:27   APL = 44%
MEASURING FIELD 1, LINE 65
FCC COLOR BARS:

          AMPL ERROR     PHASE ERROR     CHR/LUM RATIO
              %              DEG             % NOM

YEL           5.9            2.5             108.2
CYN           -.2           -3.2             102.3
GRN           2.4            1.6             104.4
MAG           5.2            -.7             108.2
RED           -.6           -3.6             100.1
BLU           1.7            4.2             104.4

(COMMANDS DONE)


TABLE 3-6

MEASUREMENTS WITH COLOR BAR CHART


recommended  for  many  codecs.  The  limits  for  caution  (*)  and 
alarm  (**)  are  set  up  for  broadcast  performance  and  are  of  no 
significance  for  this  program. 

3.3.3  Measurement  Summary 

Table  3-7  gives  the  summary  of  the  measurements  described  in  the 
two  preceding  paragraphs.  It  contains  the  parameters  which  are 
common  to  Table  3-2  and  Tables  3-3  to  3-6.  It  maintains  the 
format  of  Table  3-2  and  all  notes  are  equally  applicable.  The 
values  listed  under  corrected  tape  measurements  are  the  codec 
output  values  minus  the  measurements  on  the  original  tape  as 
listed  on  Table  3-2.  This  compensates  for  the  deficiencies  in 
the  input  signal  and  isolates  the  contributions  of  the  codecs. 
Since  in  many  cases  only  the  absolute  values  are  significant, 
several  minus  signs  have  been  dropped.  The  values  listed  under 
direct  measurements  are  the  averages  of  the  results  obtained  on 
both  codecs  under  test. 
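
The two operations described above, subtracting the original-tape
reading from the codec-output reading and averaging the direct
readings of both codecs, can be sketched as follows. This is only
an illustration in Python, not the procedure actually used for
Table 3-7; the function names are hypothetical, and the pair of
readings passed to corrected() is a made-up placeholder. The
amplitude errors are the Channel A and Channel B values of
Table 3-6.

```python
def corrected(codec_output, original_tape, keep_sign=False):
    """Codec contribution: codec output reading minus original tape reading.

    By default only the absolute value is kept, mirroring the remark
    that several minus signs were dropped.
    """
    diff = round(codec_output - original_tape, 2)
    return diff if keep_sign else abs(diff)


def direct_average(channel_a, channel_b):
    """Average of the direct readings obtained on the two codecs under test."""
    return {name: round((channel_a[name] + channel_b[name]) / 2, 2)
            for name in channel_a}


# Amplitude errors (%) of the FCC color bars, Table 3-6.
ampl_a = {"YEL": 13.3, "CYN": 6.0, "GRN": 10.9,
          "MAG": 10.0, "RED": 3.7, "BLU": 8.5}
ampl_b = {"YEL": 5.9, "CYN": -0.2, "GRN": 2.4,
          "MAG": 5.2, "RED": -0.6, "BLU": 1.7}

averaged = direct_average(ampl_a, ampl_b)
print(averaged)

# Placeholder readings (not from the report): codec output 6.5, tape 2.0.
print(corrected(6.5, 2.0))
```

Keeping the sign optional reflects the statement that in many cases
only the absolute value is significant.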

The codec output frequency response measurements have
already taken the imperfections of the input tape into account
and do not require correction. In addition, a figure of merit
for the directly measured response had to be computed; this was
done using the 6 measured multiburst frequency packet
amplitudes given in Table 3-3, which gives a result of 236.
However, the values at 3 MHz and above are questionable and
probably invalid due to spurious responses. Using only the first
three packets yields a figure of merit of 202, which is likely to
be very close to correct and matches the value obtained from the
test tape. The standard multiburst frequencies are spaced too
widely and do not cover the range between 2 and 3 MHz, which
is important for describing codec performance.


TABLE 3-7  MEASUREMENT SUMMARY


For luminance nonlinearity and differential gain and phase,
only the values at APL = 50% were used since other values are
either missing or questionable due to excessive noise. Two
different corrected values are given for chrominance nonlinear
phase. The first value is simply the result of subtraction of
the figures on Table 3-2 as mentioned above. The second value
(in parentheses) was derived by first subtracting the individual
phase  readings  and  thus  achieving  a  corrected  phase  error  for 
each  of  the  3  subcarrier  levels  and  then  taking  the  maximum 
difference  between  these  numbers  as  the  final  result.  These 
values  are  not  shown  on  the  tables  to  avoid  undue  complexity  but 
the  second  set  of  values  is  more  likely  to  be  correct.  At  any 
rate,  all  numbers  are  too  small  to  have  much  impact  on  the  final 
results.  In  vector  accuracy  measurements,  values  were  computed 
for  all  6  color  bars  but  only  the  maximum  error  values  will  be 
used  in  further  considerations. 
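
The difference between the two corrected values for chrominance
nonlinear phase can be illustrated with a short Python sketch. It
assumes that the nonlinear phase figure is the spread (maximum
minus minimum) of the phase readings at the 3 subcarrier levels;
that definition, the function names, and the readings are
illustrative assumptions and are not taken from the tables.

```python
def phase_spread(readings):
    """Spread of the phase readings (degrees) over the subcarrier levels."""
    return max(readings) - min(readings)


def corrected_by_figures(output, tape):
    """First method: subtract the two summary figures from each other."""
    return abs(phase_spread(output) - phase_spread(tape))


def corrected_per_level(output, tape):
    """Second method: correct each level first, then take the spread."""
    per_level = [o - t for o, t in zip(output, tape)]
    return phase_spread(per_level)


# Hypothetical phase readings (degrees) at the 3 subcarrier levels.
tape_readings = [0.0, 1.0, 3.0]
output_readings = [0.0, 3.0, 2.0]

print(corrected_by_figures(output_readings, tape_readings))
print(corrected_per_level(output_readings, tape_readings))
```

With these readings the first method reports no codec
contribution although every level except the first has changed,
which is consistent with the remark that the second set of values
is more likely to be correct.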

Direct  measurements  would  be  the  best  basis  for  further 
analysis,  but  unfortunately  they  are  available  only  for  the  CLI 
Rembrandt  codec  and  therefore  cannot  be  used  for  checking 
correlation  between  objective  and  subjective  tests.  However,  by 
comparison  with  the  corrected  tape  derived  measurements,  they 
enhance  the  confidence  in  the  validity  of  the  measured  data. 


Comparison  with  the  CLI  tape  measurements  shows  reasonable 
agreement  on  many  of  the  most  important  parameters  though  some 
decided  differences  are  obvious.  This  was  to  be  expected  since 
the  tests  were  performed  on  different  though  similar  codec 
models.  In  the  vector  accuracy  measurement  comparison,  the 
agreement  on  the  large  positive  yellow  amplitude  error  is  of 
interest.  During  the  1984/85  subjective  comparative  evaluations, 
a  decidedly  yellow  appearance  of  the  CLI  picture  was  frequently 
noticeable  but  apparently  did  not  influence  the  scoring  of  the 
evaluators.

3.4  Motion  Testing 

All  codec  testing  programs  have  shown  clearly  that  for  the 
average  viewer  motion  rendition  is  the  prime  factor  in  judging 
codec  performance.  Therefore,  it  is  highly  desirable  to  devise  an 
objective  method  to  evaluate  motion  performance  but  so  far  this 
has  remained  an  elusive  goal.  The  DIS  codec  test  tape  contains 
two  sequences  designed  especially  for  potential  numerical  motion 
evaluation.  Both  are  based  on  the  fact  that  a  switch  between  two 
radically  different  pictures  producing  abrupt  changes  of  many 
pixels  simulates  rapid  motion.  The  codec  output  is  not  able  to 
immediately  follow  the  input  change.  It  would  be  ideal  if  there 
was  a  measurable  residue  but  for  an  initial  appraisal  of  the 
concept  visual  observation  is  a  sufficient  practical  method. 


The two motion test sequences on the tape are a switch
between a white "window" and a black field, and between a yellow
and blue field, at 10 second intervals. Both transitions appear
practically instantaneous at the codec output because the switch
between such very simple images does not strain the capability of
the codec algorithm. Examining the transition on the tape frame
by frame showed no significant features of the yellow-blue
switch. The white window-black field switch, however, gave a
good indication that the concept is viable and with proper
modifications will achieve useful results.

Following  are  the  observations  of  the  white  window-black 
field  transitions  on  the  four  codec  output  tapes. 

a)  GEC.  The  window  changed  over  3  frames  to  a  mottled  black 
which  took  another  77  frames  to  disappear  completely. 

b)  Fujitsu.  The  window  changed  over  2  frames  to  a  mottled 
black  which  took  another  50  frames  to  disappear  completely. 

c) CLI. The first 2 frames after the switch contained
fairly strong white bars which disappeared gradually after
another 20 frames.

d) NEC. The window changed after one frame to a mottled
black which persisted for over 6 seconds (180 frames).

Interpretation of these results is not straightforward.
Though the duration of the after-image could be a measure of
motion performance, strictly visual observation cannot put a
numerical value on the equally important residual amplitude. The
number of frames that elapse before the transition to the
after-image also depends on the interpolation and frame
repetition scheme of the codec and thus is not a valid measure of
motion performance. However, after-image duration may be used on
an interim basis until a better method is developed.
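
As an interim illustration, the observations a) to d) above can be
tabulated and ranked by after-image duration. The Python sketch
below assumes 30 frames per second, which is consistent with the
statement that 180 frames correspond to 6 seconds; the variable and
function names are hypothetical. The NEC figure is a lower bound,
since the after-image persisted for over 180 frames.

```python
FRAMES_PER_SECOND = 30  # assumption, consistent with 180 frames = 6 seconds

# codec: (frames until the after-image appears, frames the after-image lasts)
observations = {
    "GEC": (3, 77),
    "Fujitsu": (2, 50),
    "CLI": (2, 20),
    "NEC": (1, 180),  # "over 6 seconds", so 180 is a lower bound
}


def after_image_seconds(frames):
    """Convert an after-image duration from frames to seconds."""
    return frames / FRAMES_PER_SECOND


# Rank the codecs from shortest to longest after-image.
ranking = sorted(observations, key=lambda codec: observations[codec][1])

for codec in ranking:
    lead_in, after_image = observations[codec]
    print(codec, after_image, round(after_image_seconds(after_image), 2))
```

Only the second number of each pair enters the ranking, because
the number of frames before the after-image appears is not
considered a valid measure of motion performance.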


4.0  TEST  DATA  CORRELATION 


4.1  Methodology 

The  available  subjective  and  objective  test  data  give  the 
results  in  different  forms  and  units.  Correlation  can  be 
investigated  only  after  all  data  have  been  reduced  to  a  common 
denominator.  It  was  chosen  arbitrarily  to  normalize  all  test 
results  to  numbers  between  zero  and  one,  with  one  representing 
the  best  result  and  zero  the  worst,  regardless  of  whether  in  the 
actual  measurements  a  higher  or  lower  value  indicates  better 
performance.  Wherever  the  ideal  measured  result  is  a  reading  of 
zero  with  possible  deviations  in  both  directions,  only  the 
absolute  value  of  the  measurement  was  taken  into  account  since 
the  direction  of  the  deviation  is  generally  immaterial.  The  mean 
scores  of  the  subjective  codec  evaluations  were  treated  in  the 
same  manner  as  the  objective  test  results. 

Table  4-1  recapitulates  the  pertinent  data  from  Tables  2-2, 
2-3  and  3-7  and  shows  the  normalized  values  computed  from  them. 
The  category  of  motion  has  been  added,  with  the  objective  results 
based  on  the  number  of  frames  needed  to  fully  complete  a  switched 
transition,  as  described  in  Paragraph  3.4.  Only  parameters  for 
which  complete  data  are  available  have  been  taken  into  account. 
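
The normalization described in this paragraph can be sketched in
Python as shown below. The text does not give the exact formula,
so a linear mapping between the worst and the best reading among
the codecs is assumed here; the function name and the differential
gain readings are hypothetical. The frame counts are the sums of
the two numbers given for each codec in Paragraph 3.4 and are not
necessarily the values used in Table 4-1.

```python
def normalize(readings, higher_is_better, use_absolute=False):
    """Map readings linearly onto 0..1, with 1 the best and 0 the worst.

    Returns None for every codec when all readings are equal, since
    such a parameter cannot be used to check correlation.
    """
    values = {k: (abs(v) if use_absolute else v) for k, v in readings.items()}
    worst, best = min(values.values()), max(values.values())
    if not higher_is_better:
        worst, best = best, worst
    if best == worst:
        return {k: None for k in values}
    return {k: abs(v - worst) / abs(best - worst) for k, v in values.items()}


# Frames needed to complete the white window-black field transition.
motion_frames = {"GEC": 80, "Fujitsu": 52, "CLI": 22, "NEC": 181}
motion_norm = normalize(motion_frames, higher_is_better=False)
print(motion_norm)

# Hypothetical readings whose ideal value is zero: only magnitude counts.
diff_gain = {"A": -4.0, "B": 2.0, "C": 1.0}
gain_norm = normalize(diff_gain, higher_is_better=False, use_absolute=True)
print(gain_norm)
```

Because the mapping always stretches the readings over the whole
range from zero to one, even very small differences between codecs
are spread from worst to best.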

When reviewing the data it becomes apparent that not all
parameters are useful in checking for data correlation. Whenever
there are no or only very small objective performance differences
between the codecs, any attempt to establish correlation would
yield only trivial or misleading results.

4.2  Results 

The results of the correlation evaluation are shown in
Figures 4-1 to 4-10. All figures have the same format. The
ordinate is common to all figures and gives the normalized value
of the subjective tests. Still and slow motion scenes are used
throughout except for Figure 4-10, which uses the values of the
lively motion tests. The abscissa gives the normalized value of
the parameter for which correlation is being investigated. Each
codec is plotted as the point whose coordinates are its
normalized objective and subjective values; the position of that
point indicates the amount of correlation between the subjective
and objective evaluations for that codec. The dashed diagonal
line is the locus of all points of ideal correlation.
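
One way to put a number on what the diagrams show is the distance
of each codec's point from the diagonal of ideal correlation. The
Python sketch below uses the vertical distance, that is, the
difference between the normalized subjective and objective values.
This measure is not defined in the text, and the function names and
the sample values are hypothetical.

```python
def deviation_from_ideal(objective, subjective):
    """Vertical distance of each codec's point from the diagonal y = x."""
    return {codec: abs(subjective[codec] - objective[codec])
            for codec in objective}


def mean_deviation(objective, subjective):
    """Average distance over all codecs; 0 means ideal correlation."""
    deviations = deviation_from_ideal(objective, subjective)
    return sum(deviations.values()) / len(deviations)


# Hypothetical normalized values for three codecs.
objective = {"A": 1.0, "B": 0.5, "C": 0.0}
subjective = {"A": 0.75, "B": 0.5, "C": 0.25}

print(deviation_from_ideal(objective, subjective))
print(round(mean_deviation(objective, subjective), 3))
```

A point on the diagonal gives a deviation of zero, and a codec
rated best on one scale and worst on the other gives the maximum
deviation of one.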

All parameters where any amount of correlation appeared
feasible were used in the evaluation. Reviewing the contents of
Table 4-1, only Line Time Distortion, Chrominance/Luminance
Intermodulation, and Chrominance Non-Linear Gain have been
omitted, because they could not yield significant results. All
other parameters have been used, producing results of varying
significance and value.

4.3  Discussion

Review of the results shows much less correlation than was
generally anticipated. Frequency Response (Figure 4-1) is the
only parameter displaying a reasonable though not ideal degree of
correlation. For some of the other parameters the results for
two codecs are fairly near the line of ideal correlation, which
however cannot be considered a significant result. In many
instances the results are completely scattered.

Figures 4-2 to 4-10  Correlation diagrams, including Short Time
Waveform, Differential Gain, and Chrominance Non-Linear Phase

Though  more  correlation  was  originally  expected,  the 
obtained  results  can  be  explained  and  are  definitely  useful. 
Frequency  response  is  one  of  the  most  important  parameters  in  any 
video  transmission  system,  and  its  effect  is  readily  visible  in 
the  reproduced  picture.  Furthermore,  codec  frequency  response  is 
always  severely  restricted  compared  to  an  analog  system. 
Therefore,  variations  can  be  easily  recognized  even  by  non-expert 
subjective evaluators. Limitation of frequency response affects
some other parameters, such as chrominance/luminance gain
inequality and short time waveform distortion, which makes
correlation for these parameters less likely.

Most  of  the  other  parameters  which  are  generally  measured 
and  specified  in  an  analog  video  system  have  values  within,  or 
near  to,  broadcast  specification  limits  and  do  not  vary  much 
between  different  codecs.  Their  values  seem  to  be  mainly 
determined  rather  randomly  by  incidental  variations  in  the  analog 
reconstitution  circuitry  of  the  decoder,  and  not  by  inherent 
differences  in  the  codec  algorithm.  Furthermore,  the  performed 
subjective  tests  were  not  conducive  to  recognizing  small 
deviations  in  analog  performance  parameters.  For  instance,  as 
mentioned previously, the impairment in vector accuracy produced
by the CLI codec, though visible, evidently was ignored by most
evaluators. Evaluation of small differences requires comparison
between original and impaired picture by video experts, which is
the method by which the present broadcast standards were
established. This, however, was done with only slightly impaired
pictures, which is not the case when codec performance is to be
evaluated. Unless an analog parameter is severely degraded, it
is  not  likely  to  be  so  recognized  in  subjective  comparative  codec 
performance  evaluation.  Thus  the  subjective  codec  evaluation 
methodology  makes  good  correlation  between  many  subjective  and 
objective  measurements  unlikely.  It  stands  to  reason  that  many 
parameters  which  are  important  for  high  quality  analog  or  digital 
pictures  are  not  significant  in  the  evaluation  of  the  inherently 
degraded  outputs  of  digital  codecs. 

Five  out  of  the  ten  correlation  diagrams  show  the  rather 
disturbing  feature  of  complete  lack  of  correlation,  namely  that 
the  codec  rated  best  subjectively  is  worst  objectively,  and  vice 
versa.  This  makes  the  respective  parameters  poor  candidates  for 
correlation  but  may  also  largely  be  due  to  a  very  limited  range 
of  the  objective  measurement  values  which  tends  to  yield  trivial 
results.  It  is  true  that  in  the  case  of  such  a  small  range  of 
actual  measurements  it  may  not  be  justifiable  to  normalize  them 
over  the  whole  range  from  zero  to  one.  A  range  of,  for  instance, 
.3  to  .7  may  be  more  realistic  and  descriptive  and  would  yield 
better  correlation  but  would  be  completely  arbitrary  and  could 
not be firmly supported.

The above does not apply to motion tests, which are unique to
codecs. There is no established objective test method and
obviously no standard. As mentioned in Paragraph 3.4, the
numerical values derived from the very limited and
unsophisticated initial tests that were performed are no more
than a first attempt to describe the motion capability of the
codec. Therefore, even the indication of a very limited
correlation shown in Figure 4-10 is an encouraging initial
result. A significant improvement can be expected only after a
good method for objective motion measurements has been established.


5.0  CONCLUSION  AND  RECOMMENDATIONS 

5.1  Review  of  Results 

The  main  purpose  of  this  study  was  to  evaluate  the  test 
signal  portion  of  the  DIS  codec  test  tape  and  to  establish 
correlation  between  subjective  and  objective  results.  The 
measurement  on  the  processed  test  signals  ran  into  some 
difficulties  because  the  taped  input  signals  had  some  unexplained 
deficiencies.  Nevertheless,  test  results  could  be  achieved  by 
subtraction  of  measured  input  and  output  values.  The  validity  of 
this  method  could  be  verified  by  comparison  with  highly  reliable 
direct  measurements  on  one  codec  only. 

The  correlation  of  subjective  and  objective  test  results  was 
much  lower  than  anticipated.  Only  the  measured  frequency 
response  correlated  fairly  well  with  the  subjective  ranking  of 
the  codecs.  Most  other  parameters  had  only  rather  small  random 
variations  and  stayed  within,  or  close  to,  standard  analog 
performance  limits.  Thus  these  parameters  were  shown  to  be  of 
lesser  importance  in  describing  codec  performance,  and  had  little 
or  no  correlation  with  subjective  ranking.  This  is  a  significant 
result  because  it  shows  that  the  measured  values  of  such 
parameters  are  not  key  to  the  objective  ranking  of  codec 
performance. 

Motion performance of a codec is the most important
parameter in subjective evaluation. There is no established
method for objective motion measurements but initial ideas were
developed and incorporated in the DIS codec test tape. The
results  showed  that  the  idea  was  feasible  but  that  many 
refinements  of  the  basic  technique  will  be  needed  to  establish 
good  correlation  with  subjective  results.  Achievement  of  the 
stated  goal  of  this  program,  namely  elimination  of  the  need  for 
subjective  testing,  will  largely  depend  on  the  availability  of  a 
good  method  of  objective  motion  capability  measurements. 

5.2  Recommended  Future  Efforts 

The  work  performed  on  this  study  presents  merely  a  first 
attempt  at  recommending  objective  measurements  on  codecs  to 
replace  the  very  cumbersome  subjective  evaluations.  It, 
therefore,  stands  to  reason  that  not  all  objectives  could  be 
fulfilled  but  what  has  been  accomplished  clearly  points  the  way 
towards  necessary  future  efforts. 

Putting  test  signals  on  tape  and  then  processing  the  tape 
through  the  codec  adds  extra  steps  and,  as  shown  by  experience, 
potential  distortions  to  the  objective  measurement  process. 
Meanwhile, the convenience and accuracy of direct measurements
with modern equipment have been demonstrated. It is therefore
recommended to make measurements on all the latest design codecs
(possibly at more than one data rate) using a test setup similar
to the one in Figure 3-7. This process will provide more than the
necessary parameter measurements without extra effort. However,
care will have to be taken that the test signals are adapted to
the limitations of the codec. One signal that definitely
requires modification is the multiburst, which must be limited to
the range up to a maximum of 3 MHz and contain frequency packets
in the (for a codec) very critical range between 2 and 3 MHz. It
is understood that such a modification of the Tektronix 1910
Signal Generator can be accomplished readily by reprogramming and
replacing one PROM.

Other  test  signals  not  commonly  used  for  analog  video  have 
found  acceptance  in  the  evaluation  of  high  data  rate  digital  PCM 
and DPCM systems. They are a steep rise horizontal rate step
function, a ramp with a low variable slope and variable setup,
and a flat field with variable setup. These signals are used to
determine  various  quantizing  distortions.  The  additional 
extensive  processing  in  low  data  rate  digital  codecs  may  often 
completely  overshadow  these  distortions  but  tests  with  such 
signals  are  highly  recommended.  The  signals  are  either  directly 
available  in  the  Tektronix  1910  Generator  or  can  be  produced  with 
minimal  modifications. 

It  has  been  shown  by  many  subjective  tests  that  motion 
rendition  is  the  most  critical  codec  performance  parameter.  Up 
to  now  no  method  of  objectively  measuring  motion  performance  has 
been  established.  This  program  presents  a  first  effort  in  this 
direction.  The  accomplished  results  have  shown  distinct 
differences  between  codecs  but  more  varied  and  complex  patterns 
will  be  needed  to  produce  accurate  and  consistent  numerical 
values. It is recommended that a program be initiated to
establish patterns (in the simplest form possibly various size
checker boards) which not only demonstrate the difference between
codecs visually but also allow integration and
measurement of the residual signal after switching and thus may
provide a numerical value describing codec motion performance.
The test pattern(s) will have to be chosen with great care so
as not to favor any particular algorithm. A successful program
in  this  area  will  make  it  possible  to  reduce  the  lengthy 
subjective  evaluation  to  a  brief  objective  measurement. 
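
The integration of the residual signal after switching can be
sketched as follows. This Python example is only an assumption of
how such a measurement might be set up: each frame is represented
as a flat list of luminance values in IRE, the residual of a frame
is taken as its mean absolute difference from the settled target
picture, and the figure is the sum of the residuals over all
frames after the switch. None of these choices, names, or sample
values comes from the study.

```python
def frame_residual(frame, target):
    """Mean absolute difference between one output frame and the target."""
    return sum(abs(p - t) for p, t in zip(frame, target)) / len(target)


def integrated_residual(output_frames, target):
    """Sum of the per-frame residuals over all frames after the switch."""
    return sum(frame_residual(frame, target) for frame in output_frames)


# Hypothetical 4-pixel frames following a switch to a black field (0 IRE).
target_frame = [0, 0, 0, 0]
frames_after_switch = [
    [100, 100, 0, 0],  # part of the white window still present
    [20, 10, 0, 0],    # mottled after-image
    [0, 0, 0, 0],      # transition complete
]

residuals = [frame_residual(f, target_frame) for f in frames_after_switch]
print(residuals)
print(integrated_residual(frames_after_switch, target_frame))
```

A single figure of this kind reflects both the duration and the
amplitude of the after-image, which visual observation alone
cannot quantify.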


